ORIENTATION DETERMINATION FROM CRYO-EM IMAGES 
USING LEAST UNSQUARED DEVIATION 
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Abstract. A major challenge in single particle reconstruction from cryo electron-microscopy is 
to establish a reliable ab-initio 3D model using 2D images with unknown orientations. Common-lines 
based methods can determine the orientations of images without additional geometric information. 
However, such methods fail when the detection rate of common-lines is too low due to the high 
level of noise in the images. An approximation to the least squares global self consistency error 
was obtained in [29j using convex relaxation by semidefinite programming. The purpose of this 
paper is three-fold. First, we introduce another, yet more robust, global self consistency error in an 
optimization problem that can be solved via semidefinite relaxation. Second, we introduce a spectral 
norm constraint to robustify the relaxed problem. Third, we use alternating direction method or the 
iteratively reweighted least squares (IRLS) procedure to solve the problem. Numerical experiments 
demonstrate that the proposed method significantly decreases the orientation estimation error when 
the detection rate of common-lines is low. 

1. Introduction. In single particle analysis, cryo-electron microscopy (cryo- 
EM) is used to attain a resolution sufficient to interpret fine details in three-dimensional 
(3D) macromolecular structures [HI [SB QUI [35] ■ Cryo-EM is used to acquire 2D pro- 
jection images of thousands of individual, identical frozen-hydrated macromolecules 
at random unknown orientations and positions. The collected images are extremely 
noisy due to the limited electron dose used for imaging to avoid excessive beam dam- 
age. In addition, the unknown orientational information of the imaged particles need 
to be estimated for 3D reconstruction. An ab-initio estimation of the orientations of 
images using the random-conical tilt technique |24| or common-lines based approaches 
[35] 128] 129] are often applied after multivariate statistical analysis [13j [36] and clas- 
sification techniques [34], [21] [30] that are used to sort and partition the large set of 
images by their viewing directions, producing "class averages" of enhanced signal-to- 
noise ratio (SNR). Using the ab-initio estimation of the orientations, a preliminary 
3D map is reconstructed from the images by a 3D reconstruction algorithm. The 
initial model is then iteratively refined [20] in order to obtain a higher-resolution 3D 
reconstruction. 

The Fourier projection-slice theorem (page 11 in |18| ) plays a fundamental role 
in the common-lines based reconstruction methods. The theorem states that a slice 
extracted from the frequency domain representation of a 3D volume yields the Fourier 
transform of a 2D projection of the volume in a direction perpendicular to the slice 
(Figure [Ll) . Thus, any two projections imaged from different viewing angles will 
intersect at a line in Fourier space, which is called the common-line between the two 
images. The common-lines between any three images with different projection di- 
rections determine their relative orientation up to handedness. This is the basis of 
the "angular reconstitution" technique of van Heel [35] , which was also developed 
independently by Vainshtein and Goncharov [33J . This technique uses a brute force 
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Fig. 1.1: Fourier projection-slice theorem. In the middle, p is a polar Fourier 
transform of projection Pj on the left. The red line represents the direction of a 
common-line between Pj and Pj on Pj. On the right, the two transformed images 
P and Pj intersect with each other at the common- line after rotations i?j and Rj, 



yielding the equation (3.3) 



approach to include additional projections. Farrow and Ottensmeyer [7J used quater- 
nions to obtain the relative orientation of a new projection in a least square sense. 
The main problem with such techniques is that they are sensitive to false detection 
of common lines that leads to the accumulation of errors. Penczek et.al. [33] tried to 
obtain the rotations corresponding to all projections simultaneously by minimizing a 
global energy functional, which requires a brute force search in an exponentially large 
parametric space of all possible orientations for all projections. Mallicket. al. [Tfj] and 
Singer et al. [28] applied Bayesian approaches to use common-lines information from 
different groups of projections. Recently, Singer et. al. [29] developed two algorithms 
based on eigenvectors and semidefinite programming for estimating the orientations of 
all images. These correspond to convex relaxations of the global self-consistency error 
minimization. The two algorithms accurately estimate all orientations at relatively 
low common-line detection rates. 

When the signal-to-noise ratio (SNR) of the image is significantly low, the de- 
tected common-lines consist of a modest number of noisy inliers, which are explained 
well by the image orientations, along with a large number of outliers, that have no 
structure. The standard common-lines based methods, including those using the least 
squares (LS) [71115]) are sensitive to these outliers. Here we propose to estimate the 
orientations by minimizing a different, more robust self consistency error, which is 
the sum of unsquared residuals [191 132j , rather than the sum of squared residuals in 
the LS formulation. The minimization problem is solved via semidefinite relaxation. 
When the detection rate of common- lines is extremely low, a spectral norm constraint 
is added to robustify the result. The minimization problem can be either solved by 
alternating direction method (ADM) or the iteratively reweighted LS (IRLS) proce- 
dure. 
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2. Detection of common-lines between images. Typically, the first step for 
detecting common lines is to compute the 2D Fourier transform of each image on a 
polar grid using, e.g., the non-uniform fast Fourier transform (NUFFT) |5J [HI IT2~] . 
The transformed images have resolution n r in the radial direction and resolution ng 
in the angular direction, that is, the radial resolution n r is the number of equi-spaced 
samples along each ray in the radial direction, and the angular resolution ng is the 
number of angularly equis-spaced Fourier rays computed for each image (Figure 
For simplicity, we let ng be an even number. The transformed images are denoted 
as (jfc, . . . , ^n 9 -ij > where = (/^.d l^ 2 , ... , /„,„ r ) is an n r dimensional vector, 
me {0, 1, . . . , ng — 1} is the index of a ray, k 6 {1, 2, . . . , K} is the index of an image 
and K is the number of images. The DC term is shared by all lines independently of 
the image, and is therefore excluded for comparison. To determine the common line 
between two images Pi and Pj, the similarity between all ng radial lines Iq, l\, . . . , 
from the first image with all ng radial lines l , l\, . . . , l n _ t from the second image 

are measured (overall ng comparisons) , and the pair of radial lines l l mi . and V m . . 
with the highest similarity are identified as the common-line pair between the two 
images. However, as a radial line is the complex conjugate of its antipodal line, the 
similarity measure between l l mi and P m2 has the same value as that between their 

antipodal lines It , ,„ and P , ,„ (where addition of indices is taken modulo ng). 

r m-i+ng/2 m 2 +n e /2 \ u l 

Thus the number of distinct similarity measures that need to be computed is ng/2 
obtained by restricting the index mi to take values between and ng/2 and letting 
TO2 take any of the ng possibilities (see also [35] and [22 , p. 255). Equivalently, it 
is possible to compare real valued ID line projections of the 2D projection images, 
instead of comparing radial Fourier lines that are complex valued. According to the 
Fourier projection-slice theorem, each ID projection is obtained by the inverse Fourier 
transform of the corresponding Fourier radial line P^ and its antipodal line ^ l+n9 / 2 > 
and is denoted as s^. The ID projection lines of a cryo-EM image can be displayed 
as a 2D image known as a "sinogram" (see [551 12T>] ). 

Traditionally, the pair of radial lines (or sinogram lines) that has the maximum 
normalized cross correlation is declared as the common line, that is, 

/ P P \ 

(m,ij,mj,i) = argmax — ^ , forallt^j, (2-1) 
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where niij is a discrete estimate for where the j'th image intersects with the i'th 
image. In practice, a weighted correlation, which is equivalent to applying a combi- 
nation of high-pass and low-pass filters is used to determine proximity. As noted in 
[35) , the normalization is performed so that the correlation coefficient becomes a more 
reliable measure of similarity between radial lines. Note that even with clean images, 
this estimate will have a small deviation from its ground truth (unknown) value due 
to discretization errors. With noisy images, large deviations of the estimates from 
their true values (say, errors of more than 10°) are frequent, and their frequency in- 
creases with the level of noise. We refer to common lines whose m^j and rrij^ values 
were estimated accurately (up to a given discretization error tolerance) as "correctly 
detected" common lines, or "inliers" and to the remaining common lines as "falsely 
detected", or "outliers". 

3. The weighted LS and the least unsquared deviation (LUD). We de- 
fine the directions of detected common-lines between the transformed image i and 



transformed image j as unit vectors (Figure |1.1[ ) 



(cos (2irm,ij /rig) , sin (2irmij/rio)) , 
(cos (2irmji/ng) , sin {2'Kmji/ne)) , 



(3.1) 
(3.2) 



where and Cji are on the transformed images i and J respectively, and my and 



are discrete estimate for the common lines' positions using (2.1 1. Let the rotation 



matrices Ri <E SO(3), i — ,K represent the orientations of the K images. 

According to the Fourier projection-slice theorem, the common lines on every two 
images should be the same after the 2D transformed images are inserted in the 3D 
Fourier space using the corresponding rotation matrices, that is, 



c ■ 





— Ri 
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for 1 < i < j < K. 



(3.3) 



These can be viewed as 



K \ . 

2 J linear equations for the 6K variables corresponding 
to the first two columns of the rotation matrices (the third column of each rotation 



matrix does not contribute in (3.3) due to the zero third entries in the common-line 
vectors in K 3 ). The weighted LS approach for solving this system can be formulated 
as the minimization problem 



Ki, 



min } Wi 

A(£SO(3)^ ' 



Ri(cij,0) -Rj(cji,0Y 



(3.4) 



where the weights Wij indicates the confidence in the detections of common-lines be- 
tween pairs of images. Since (c^, 0) T and (cji, 0) T are 3D unit vectors, their rotations 



are also unit vectors; that is, 



Ri (^,0) 



Rj (cji,0) 



1. It follows that the 



minimization problem (3.4) is equivalent to the maximization problem of the sum of 
dot products 



Hi, 



max w v ( Ri °) T > R j ^ji, 0) T ) 



(3.5) 



subject to the constraints (4.1). When the weight = 1 for each pair i ^ j, (3.5) 



is equivalent to the LS problem. The LS approach was considered in [23], and more 
recently in |29j using convex relaxation of the non-convex constraint set. The least 
squares approach may not be optimal however in our case due to the typically large 



proportion of outliers (Figure 3.1). 

To guard the orientation estimations against outliers, we replace the sum of 



weighted squared residuals in (3.4) with the more robust sum of unsquared resid- 
uals and obtain 



Ri, 



mm Vj .Rj (cj ,■ , 0) - Rj (cj h 0) 
,R x eSO(3)f--f II 



(3.6) 



or equivalently, 



Ri, 



1^3 



(3j,0) T -RjRjicjuOf 



(3.7) 
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(b) SNR=l/64 
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Fig. 3.1: Left and Middle: The histogram plots of errors in the detected common- 



lines Cij for all i and j, i.e. 



where Ri is a true rotation 



Ri(cij,0) -Rj(cji,0) 

matrix for all i. Right: elucidating the different between the squared distance and 
the absolute deviation. 



and we refer to this problem as least unsquared deviation (LUD) problem. The self 



result from outliers (Figure 3Tlc|) r 



consistency error given in (3.6| reduces the contribution from large residuals that may 



4. Semidefinite Programming Relaxation (SDR) and the Rounding 
procedure. However, both the weighted LS problem (13.41) and the LUD problem 



(3.6) are non-convex and therefore extremely difficult to solve if one requires the 
matrices Ri to be rotations, that is, when adding the constraints 

RiRj = 7 3 , det (ik) = 1 for i = 1, . . . , K, (4.1) 

where J3 is the 3x3 identity matrix. A relaxation method that neglects the constraints 



(4.1) will simply collapse to the trivial solution R\ — . . . = Rk = which obviously 



does not satisfy the constraint (4.1) 



The relaxation in [53] that uses semidefinite programming (SDP) can be modified 
in a straightforward manner in order to deal with non-unity weights ro,j in (3.5). 



We present this modification here for two reasons. The first reason is making the 
exposition as self contained as possible. The second reason is that the rounding 
procedure after SDP is slightly different than the one presented in [29] and is closer 
in spirit to the rounding procedure of Goemans and Williamson for the MAX-CUT 
problem [TT] . 

4.1. Constructing the Gram matrix G from the rotations Ri. We denote 
the columns of the rotation matrix Ri by R\ , i?| , and i?f , and write the rotation 
matrices as 

I 1 1 1 l 
Ri=\ Rj R? Rf I ,i = l,...,K. 

V I I I 

We define a 3 x 2K matrix R given by 

( 1 1 1 
R = I R{ Rj ■■■ Rl R\ ■■■ R} K R\ \ ■ (4.2) 



focus on analyzing (3.6 I since 



1 The weighted version of (3.6 I is min^ t ... t R K gso(3) Ei^j w ij (cij,0) T - Rj (cjj, 0) T . We 
the analysis for its weighted version is similar 



The Gram matrix G for the matrix R is a 2K x 2K matrix of inner products between 
the 3D column vectors of R, that is, 



G = R T R. (4.3) 

Clearly, G is a rank-3 scmidcfinitc positive matrix (G )p 0), which can be conveniently 
written as a block matrix 

G = ( G ij)i,j I.....K j 

where Gij is the 2x2 upper left block of the rotation matrix RfRj, that is, 

The orthogonality of the rotation matrices (R[Ri = I) implies that 

Gu=I 2 , i= 1,2,- ■■ ,K, (4.4) 

where I2 is a 2 x 2 identity matrix. 

4.2. SDR for weighted LS. We first define two 2K x 2K matrices S — 
0%')i,j=i,~ ,K an d W = {Wij)i^ = i } ... } Ki where the 2x2 sub-blocks Sij and Wij 
are given by 

Sij = cJiQj and ^ = Wjj f J J 

Both the matrices S and are symmetric and they stores all available common-line 
information and weight information respectively. It follows that the objective function 
(3.5) is the trace of the matrix (/3 o S) G: 

J2 (Si. °) T . fi i °) T ) = trace (( W °S)G) : (4.5) 

where the symbol o denotes the Hadamard product between two matrices. A natural 
relaxation of the optimization problem (3.5) is thus given by the SDP 

max trace ((W o S) G) (4.6) 

G( Z R 2KX2K 

s.t. G U =I 2 , i= 1,2,... ,K (4.7) 

G fc= (4.8) 

The only constraints missing in this SDP formulation is the non-convex rank-3 con- 
straint on the Gram matrix G, which is referred as a semidefinite relaxation (SDR) 
[15] . The problem (4.6)-(4.8) is a standard SDP problem, and it can be well solved by 
the solver SDPLR 3 which takes advantage of the low-rank property of G. SDPLR is 
a first-order algorithm via low-rank factorization and hence can provide approximate 
solutions for large scale problems. Moreover, the iterations of SDPLR are extremely 
fast. 
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4.3. SDR for LUD. Similar to defining the Gram matrix G in (4.3), we define 
a 3K x 3K matrix G as G = (Gy)i,j=i,... Ki where the 3x3 blocks = RfRj. 
Then a natural SDR for (|3~7l is 



Gij (c 3 i,0) 



, s.t. Gi,; — I3 



(4.9) 



The constraints missing in this SDP formulation are the non-convex rank-3 constraint 
and the determinant constraints det(Gij) = 1 on the Gram matrix G. However, the 
solution G to (4.9) is not unique. Note that if a set of rotation matrices {Ri} is the 



solution to (3.7), then the set of reflected rotation matrices {JRiJ} is also the solution 



to (3.7), where the matrix J is defined as 




Thus another solution to (4.9 1 is the Gram matrix G J 



(G J 



-l,— ,k with the 2x2 

sub-blocks given by Gfj_=JRj JJRjJ = JRJRjJ. It can be verified that \{G + G J ) 



ijjt,3 



is also the solution to (4.9). Using the fact that 



\{Gij 



G^) - 



Gij 





the problem (4.9) is reduced to 



mm 



Guc 



l 3 ' 



s.t. Gu = h 



(4.10) 



This is a SDR for the LUD problem (3.6). The problem (4.10) can be solved using 



alternating direction method (see details in section 6.2) 



4.4. The Randomized Rounding Procedure. The matrix R is recovered 
from a random projection of the solution G of the SDP (4.6). We randomly draw 
a 2K x 3 matrix P from the Stiefel manifold V3(M. 2K ). The random matrix P is 
computed using the orthogonal matrix Q and the upper triangular matrix R from 
QR factorization of a random matrix with standard normal entries, that is, P = 
Q sign (diag (R)), where sign is the sign function and diag(i?) is an diagonal matrix 
whose diagonal entries are the same as those in the matrix R. The matrix P is shown 
to be drawn uniformly from the Stiefel manifold in |17j . We project the solution G 
onto the subspace spanned by the three columns of the matrix GP We therefore 
expect to be able to recover the first two columns of the rotation matrices R\ , . . . , Rk 
from the three column vectors u 1 , u 2 and u 3 of the the first 3 columns of an unitary 
matrix U, where U is from singular value decomposition (SVD) of the 2K x 3 matrix 
GP. Note that it is possible to recover the molecule only up to a global orthogonal 
transformation, that is, up to rotation and possibly reflection, since the unitary matrix 



2 The 3 dimensional subspace can also be spanned by the eigenvectors associated to the top three 
eigenvalues of G. In addition, the fourth largest eigenvalue is expected to be nearly zero. See more 
details in [29]. 
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U is not uniquely determined from SVD of the matrix GP. The recovery of rotation 
matrices is performed by constructing for every i = a3 x 3 matrix 



At 



A] A 2 A? 



whose columns are given by 



-4 



l 2i-X 



Ai 



"2, 



, A| = cross (A},A? 



In practice, due to erroneous common lines, the vectors A\ and Af are approximately 
two orthonormal vectors with unit norm. Define 3x2 matrices constructed from Ai 
and Ri as 



A™ = 



A} Af 



and R [ r 2] = 



Ri Ri 



m 2] M 21 

We estimate the matrix R\ ' J as the closest matrix to A\' 1 on the Stiefel manifold 
V2(K 3 ) in the Frobenius matrix norm. This is done via the well-known procedure pQ. 
Rf ^ = UiV?, where Af' 2] = t/^Vf is the singular value decomposition of A\ . 



We note that except for the orthogonality constraint (4.7 1, the semidefinite program 



( 4.6 H 4.8 1 is identical to the Goemans- Williamson SDP for finding the maximum 



cut in a weighted graph , where the SDR and the randomized rounding procedure 
[311 115) for maximum cut problem is proved to have a 0.87 performance guaranty. 
From the complexity point of view, SDP can be solved in polynomial time to any 
given precision. The idea of using SDP for determining image orientations in cryo- 
EM was originally proposed in [29] . 

5. The Spectral Norm Constraint. However in our numerical experiments 
(section|8]), we observed that when the detected common-lines have a lot of "out-liers", 
the viewing directions of images estimated by solving either (4.6)-(4.8) or (4.10) are 



highly clustered (Figure 5.1). In other words, instead of being a global assignment 
of rotations in SO(3) for the images orientation, the solution to (4.6 )-p~8| or (4.10) 



tends to be a global assignment of 2D orthogonal transformation in 0(2) which is 
a combination of in-plane rotations and reflections. In order to prevent such ten- 
dency, we add another constraint on the spectral norm of the Gram matrix G to the 



optimization problem (4.6)-(4.8) or (4.10). 



or equivalently 



G ^aK ■ I 



\\G\\ 2 <aK, 



(5.1) 



(5.2) 



where ||G|| 2 is the spectral norm of the matrix G, and | < a < 1 controls the spread 
of the viewing directions. If the true image orientations are uniformly sampled from 
the rotation group SO(3), then by the law of large number and the symmetry of the 
distribution of orientations, we can prove the spectrum norm of the true Gram matrix 



Blue point: projection 

direction of an image a=0.7 a=0.9 




Fig. 5.1: The dependency of the spectral norm of G (denoted as aK here) on the 
distribution of orientations of the images. Here K = 100. The larger a is, the more 
clustered the orientations are. 



Gtrue is approximately |if . If the true viewing directions are highly clustered, then 
the spectral norm of the true Gram matrix Gtme is approximately K. For a known 
distribution of orientations, we can compute the spectrum norm of the true Gram 
matrix Gtme accordingly, which can be verified to be a number between | and 1. To 
prevent a solution of clustered viewing angles, we fix a to some number satisfying 
| < a < 1. 

6. The Alternating Direction Method (ADM) for the problems with 
the Spectral Norm Constraint. Here we generalize the application of ADM to 
SDP problems in [JO] to the application to the problems raised in the previous sections. 
ADM is a multiple-splitting algorithm that minimizes the dual augmented Lagrangian 
function in an alternating fashion such that in each step it firstly minimizes the 
Lagrange multipliers, then the dual slack variables, and finally the primal variables. 
In addition, in the minimization over a certain variable, the other variables are kept 
fixed. 

6.1. ADM for the relaxed weighted LS problem with the Spectral 



Norm Constraint. The weighted LS problem after SDR ( 4.6 1-( 4.8 1 can be effi 



ciently solved using SDPLR [3J. However, SDPLR is not suitable for the problem 



after the spectral norm constraint on G (5.2 1 is added to (4.6)-(4.8). This is because 



the constraint (5.2 1 can be written as aKI — G >p where aKI — G does not have a 
low rank structure. Moreover, SDP solvers using polynomial-time primal-dual interior 
point methods are designed for small to medium sized problems. Therefore they are 
not suitable for our problem. Here we devise a ADM for this problem which takes 



advantage of the low-rank property of G. After the spectral norm constraint (5.2) is 
added, the problem becomes 



mm 

G^o 



where 



-(C,G) (6.1) 

s.t. A(G) = b (6.2) 

\\G\\ 2 <aK (6.3) 

G\l 

A{G)= I Gf i ,b = I 6? I , (6.4) 



2 "ii "•" 2 "ii / i=l,2,...,JV \ "* / »=1,2 N 
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b\ = b 2 = l,bf = for all i, 



Gfj denotes the (p, q) th element in the 2x2 sub-block Gy, C — Wo S is a symmetric 
matrix and the product (G, G) = trace (GG). Following the equality (A(G) , y) = 



(G,.4*(y)) for arbitrary y 
defined as 



, the adjoint of the operator .4 is 



i=l,2 iV 



-A* (y) = Y 



yll yl2 
y21 y22 



where for i = 1, 2, . . . , N 

Y 11 - n 1 Y 22 - ii 2 and Y 12 - Y 21 - n 3 



It can be verified that A A* = I. The dual problem of problem ( 6. 1 )-( 6.3 1 is 
max min - (C, G) - (y, A (G) — b) — (G, X) . 

y,X>pO ||G|| 2 <Q-R" 



By rearranging terms in (6.5), we obtain 



max min - (G + X + A* (y) , G) + y 1 b. 

y,X^0 \\G\\ 2 <aK 



(6.5) 



(6.6) 



Using the fact that the dual norm of the spectral norm is nuclear norm (Proposition 
2.1 in |25J, we can obtain from (6.6) the dual problem 



max y T b- aK\\C + X + A* (y)|| 



(6.7) 



where denotes the nuclear norm. Introducing a variable Z = C + X + A* (y) . 
we obtain from (6.7| that 



min -y T b + aK\\Z\l 
s.t.Z = C + X + A* (y). 



(6.8) 
(6.9) 



Since Z is a symmetric matrix, \\Z\\^ is the summation of the absolute value of the 
eigenvalues of Z. The augmented Lagrangian function of (6.8)-(|6.9|) is defined as 



£(y,Z,X,G) =-y T b + aK\\Z\\^ + (G, G + X + A* (y) - Z) 



t\\C + X + A* (y)-Z\\ 2 F . 



(6.10) 

where /i > is a penalty parameter. Using the augmented Lagrangian function 
(6.10), we devise an ADM that minimizes (6.10) with respect to y, Z, X, and G in 
an alternating fashion, that is, given some initial guess, in each iteration the following 
three subproblems are solved sequentially: 



y fe+1 =argmin£(y,Z fc ,X fe ,G' £ ), 
y 

Z k+1 = argmm£ (y fc+1 , Z, X k , G k ) 



X 



fe+i 



arg min£(y fe+1 ,Z fe+1 ,X, G k ) 
10 



(6.11) 
(6.12) 
(6.13) 



and the Lagrange multiplier G is updated by 

G k+i = G k +1(1 (c + X k+l + A* (y fc+1 ) - Z k+1 ) , 

where 7 € ^0, 1+ 9 V ^ ^ is an appropriately chosen step length. 

To solve the subproblem (6.11 1, we use the first order optimality condition 

W y C(y,Z k ,X k ,G k ) = 

and the fact that AA* = I, and wc obtain 



(6.14) 



1 



= -A (C + X k - Z k ) - -{A (G) - b) . 

A* 

By rearranging the terms of C (y fc+1 , Z, X k , G k ), it can be verified that the sub- 
problem (6.12| is equivalent to 



min \\Z\\ 

z fl 



where B k = C + X k + A* (y fc+1 ) + ^G k . Let B k = UAU T be the spectral de- 
composition of the matrix B k , where A = diag (A) = diag (Ax, ■ • ■ , ^2k) ■ Then 
Z k+1 — [/diag (z) U T , where z is the optimal solution of the problem 



• aK 11 1 
mm 1 1 z I 

z fl 



x^ll-A" 2 



2 • 



(6.15) 



It can be shown that the unique solution of (6.15 1 admits a closed form called the soft- 
thresholding operator, following a terminology introduced by Donoho and Johnstone 
it can be written as 

= fo, if |A,-| <aK/fi 

Zl otherwise. 



The problem (6.13) can be shown to be equivalent to 



min \\X - H k \\~ , s.t. X ^ 0, 



x 



where H 



Z k+i - c - A* (y k+1 



i(? fc . The solution X k+1 = V+E+Vl is the 



Euclidean projection of H k onto the semidefinite cone (section 8.1.1 in [5]), where 

vsv T = ( v+ y_ 



x+ 
s_ 



V T 



is the spectral decomposition of the matrix H k , and £+ and are the positive and 
negative eigenvalues of H k . 

It follows from the updating equation (6.14) that 

G k+1 = (1 - j)G k + 7/1 (c + X k+1 + A* (y k+1 ) - Z k+1 + 
= (1 - j)G k + yn (X k+1 - H k ) . 
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6.2. ADM for the relaxed LUD problem with the Spectral Norm Con- 
straint. Consider the LUD problem after SDR: 



mm 



(6.16) 



where G, .4 and b are defined in (4.3 1 and (6.4) respectively. The ADM devised 
to solve (6.16) is similar to and simpler than the ADM devised to solve the one 
with the spectral norm constraint. We focus on the more difficult problem with the 
spectral norm constraint. Introducing Xy = cfj- — G^-c^ and adding the spectral norm 
constraint ||G|| 2 < aK, we obtain 

minV llx^-ll s.t. A (G) = b, xy = c£ - C,/,,. \\G\\ 2 < aK. (6.17) 



The dual problem of problem (6.17) is 

a „^TV R .E(ll X ^l|-(^' X ^-^+ G ^))-(y^( G )- b )-( G ^)- 

(6.18) 



By rearranging terms in (6.18) , we obtain 

max min - (Q (0) + X + A* (y) , G) + y T b 

6»ij,y,Jf^0x i:j ,||G|| 2 <Qi<" 



where 9 — (dij) 



31 fl2 



Q{0) 



i / g 11 (0) g 12 (e) 



2 1 g 21 (©) g 22 (6) 



'iji w ij ) ' ^ij 



and g M (0) 



(c 1 c 2 ) 



K1"1K ^K2 W 2K 



for p, q = 1, 2. It is easy to verify that for 1 < i < j < K 



(6.19) 



OP r l 



mi:a(||xy|| - (Oi^Xij)) = 



if \\8ijW <1 
otherwise. 



(6.20) 



In fact, (6.20) is obtained using the inequality 

||Xy|| - (di^Xij) = \\XijW - \\0 i:j \\ \\XijW (dij/ \\0ijW ,Xij/ \\x 



>\\xi i \\-Pii\\\\*ij\\ = 0--\\Oi. 



(6.21) 



and the inequality in ( |6.21 ) holds when 0^ and Xy have the same direction. Using 

the fact that the dual norm of the spectral norm is nuclear norm and the fact in 
(6.20), we can obtain from ( 6.19[ ) the dual problem 

min -y T b-J2(°«^j)+<xK\\Z\l (6-22) 

i<j 

s.t. Z = Q(9) + X + A* (y) , and ||0y|| < 1. (6.23) 
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The augmented Lagrangian function of problem ( 6.22 )-( 6.23) is defined as 



C (y, 6, Z, X,G)=- y T b + aK \\Z\l - ]T (fly, eg) + (G, Q (6) + X + A* (y) - Z) 

(6.24) 



+ |||g(0) + X + ^ (y)-zf F , 

for ll^y ll < 1, where /i > is a penalty parameter. Similar to section 6.1 using the 
augmented Lagrangian function (6.24), ADM is used to minimize (6.24) with respect 



to y, 8, Z, X , and G alternatively, that is, given some initial guess, in each iteration 
the following four subproblems are solved sequentially: 



r k+1 = arg min £ (y, 6 k , Z k ,X k , G 



k + l = arg „ min C (y k+ \0, Z k ,X k , G k ) , 



il<i 



Z k+1 = arg min C ( y k+ \0 k+ \Z, X k ,G k 



X k+1 = arg mm £ (y fe+1 , fe+1 , Z k+ \ X, G k 



and the Lagrange multiplier G is updated by 

G k+1 = G k + 7 ^ (Q (fl fc+1 ) + X k+1 + A* (y fe+1 ) - Z 



k+l 



(6.25) 
(6.26) 

(6.27) 
(6.28) 

(6.29) 



where 7 6 ^0, 1+ 2 V ^ ^ is an approprately chosen step length. The methods to solve 

subpr oblems (6.25), (6.27) and (6.28 1 are similar to those used in (6.11), (6.12) and 
(6.13 To solve subproblem (6.26), we rearrange the terms of C (y k + T ^6 7 Z k ~X k , G k ) 
and obtain an cqivalcnt problem 



mm - 
On 



p, S.t. ||^ || < 1, 



where $ = X k +A* (y k+1 )-Z k + ^G k , $ = 



$11 $12 
$21 $22 



and <&,y = 



Problem (6.29) is further simplified as 
min (0 tf , //$;.,• of; -c^) 



^ll^-H 2 , s.t. ||%||<1, 



$12 

I J I J 

$21 $22 

13 V 



whose solution is 



< 1, 



otherwise. 



The convergence analysis and the practical issues related to how to take advan- 
tage of low-rank assumption of G in the eigenvalue decomposition performed at each 
iteration, strategies for adjusting the penalty parameter jj,, the use of a step size 7 for 
updating the primal variable X and termination rules using the in-feasibility measures 
are discussed in details in 1301. 
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7. The Iterative Reweighted Least Squares (IRLS) Procedure. The 

LUD problem (3.7 1 can also be solved via semidefinite relaxation as 



mm 

GeR 2Kx 



lK F(G)= E J 2 " 2 E ^ijS: 



s.t. 



,K, 



i,j=l,2 K 

Gu = h, i = 1,2 
G fc= 0, 

||G|| 2 < a A (optional) 



',9=1,2 



(7.1) 

(7.2) 
(7.3) 
(7.4) 



where a is a fixed number between \ and 1, and the spectral norm constraint on G 



(7.4) is added when the solution to the problem (7.1 1-(7.3) is a set of highly clustered 



rotations. Notice that this relaxed problem is not convex since the objective function 
(7.1) is concave. Thus a good initial guess of G, for example, an estimate of G using 



LS, is used to obtain an approximate solution of (7.1 )-(7.3) (possibly with (7.4)) by the 
IRLS procedure [HIH] described in algorithm [l Before the rounding procedure, the 



Algorithm 1 (the IRLS proce dure ) Solve optimization problem (7.1)-(7.3) (with 
the spectral norm constraint on G (7.4) if the input parameter a satisfies | < a < 1), 



and then recover the orientations by rounding. 

Require: a 2A x 2A common-line matrix S, a regularization parameter e, a param- 
eter a and the total number of iterations N 



G° = 0; 
for k = 1 



1 Vi,i = !,••■ ,K; 



N, step size = 1 do 



1-3 



update W by setting w. 

if | < a < 1, obtain G k by so lvin g the problem (6.1)-(6.3) using ADM; other 



wise, obtain G k by solving ( |4.6[ )-( |4.8[ ) using SDPLR (with initial guess G k 1 )\ 



= E 



-1,2 ^ij °ij 



K 



the residual r 
end for 

obtain estimated orientations R\ , . 
procedure in section [4. 4| 



Rk from G using the randomized rounding 



IRLS procedure finds an approximate solution to the optimization problem ( 7.1 )-( 7.3 ) 
(possibly with (7.4)) by solving its smoothing version 



C^, K F ^= E /2-2 E G^Sff + e? (7.5) 

.,; 1.2 K y P,<2=1,2 

s.t. Gu = I 2 , i = l,2,--,K, (7.6) 

G^O, (7.7) 
||G|| 2 < aK (optional). (7.8) 

where e > is a small number. In each iteration, the IRLS procedure solves a problem 
G k+1 = argminE w ^ (2-2 (Gij,Sij) + e 2 ) s.t. A(G) = b, (optional: ||G|| 2 < aK) 

(7.9) 
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on the (k + l)th iteration, where wfj = 1, and 

w% = l/ v /2-2(G^,^)+e 2 , Vfc > 0. 

In other words, we emphasize the detected common-lines that are explained well by 
the current estimated Gram matrix G k . The presence of the regularization e ensures 
that no single detected common-line can gain undue influence. 

G k+1 = argmax(W^ fc o S,G) s.t. A(G) = b (optional: ||G|| 2 < aK). (7.10) 

We repeat the process until the residual sequence {r k } has converged. We prove the 
following theorem. 

Theorem 7.1. Let {G fe } be the sequence generated by the IRLS procedure, then 

F{G k+ \e) < F(G k ,e). (7.11) 



Proof. Here we prove the theorem without the spectral norm constraint on G. 
The arguments can be generalized to the IRLS procedure with the spectral norm 
constraint. Since G k is the solution of (7.101, there exists y k € K 2 ™ and X k £ R2nx2n 
such that 



-A* (y fc ) + X k + W^ 1 oS = 0, A(G k ) - b = 0, 
G k 0, X k 0, (G k ,X k ) = 0. 

Hence we have 

= -(y k f{A(G k ) - b) + (y k+1 ) T (A(G k+1 ) - b) 
= (y fe+1 - y k ) T (A{G k ) - b) + (X(y fc+1 ), G k+1 - G k ) 
= (X k+1 + W k o S, G k+1 - G k ) 
< (W k oS,G k+1 -G fe ) 

= \ E H$ ( 2 - 2 ^\S tj ) + e 2 ) + 0* (2 - 2 (G k j,Sij) + , 2 )) 



(7.12) 
(7.13) 



-y 

2 ^ 



2-2(G^,^)+ £ 2 
^2-2{G%,S ij ) + e* 



+ ^2-2{G k ij ,S ij )+eA , 



(7.14) 



(7.15) 



where the third equality uses (7.12), and the inequality (7.14) uses (7.13). From (7.15) 
we obtain 



F(G k ,e) 2 



£^2 -2 + 
> (Ev /2 2 <^u) ( V 



i*7 



2 -2(G^,^)+ £ 2 
2- 2 (G* ,%>+e 2 



> (E v 2 - 2 ^ +1 '^) + e2 l -^(G fe+1 , e ) 2 , 



(7.16) 
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where the last inequality uses Cauchy-Schwarz inequality and the equality holds if 
and only if 



^2-2(G*+ 1 ,S ij )+. 
y j2-2(G!? j ,S ij )+e 



c for all i =/= j, 



(7.17) 



where c is a constant. Thus (7.11) is confirmed. □ 



In addition, using Holder's inequality, the analysis can be generalized to the 
reweighted approach to solve 



min 



E 



(2-2 (Gi^Sij))' s.t. A(G) = b, (optional: ||G|| 2 < aK) 



where < p < 1. The problem ( |7.18[ ) is a SDR of the problem 
min \\R i (cij,0y r - Rj (cji,0) T 



(7.18) 



(7.19) 



The smaller p is, the more penalty the outliers in the detected common-lines receive. 

8. Numerical results. All numerical experiments were performed on a machine 
with 2 Intel(R) Xeon(R) CPUs X5570, each with 4 cores, running at 2.93 GHz. We 
simulated 100 centered images of size 129 x 129 pixels of the 50S ribosomal subunit 



with SNR=l/32 and 1/64 respectively (Figure 8.1). The noise added to the images 
is white Gaussian. The pixel size is 2.4A. The orientations of the images are sampled 
from the uniform distribution over SO (3). The polar Fourier transform have radial 
resolution n r — 100 and angular resolution ng — 360. Common-line pairs that are 
detected with an error smaller than 10° are considered to be correct. The common- 
line detection rates are 44% and 23% for images with SNR=l/32 and SNR=l/64, 
respectively (Figure 3.1). 

We define the mean squared error (MSE) of the estimated rotation matrices 
i?i , . . . , Rk a s 



1 K ii 
MSE= - ^ R 



OR, 



where O is the optimal solution to the registration problem between the two sets of 
rotations {R\, . . . , Rk} and |i£i, . . . , Rk\ in the sense of minimizing the MSE. As 
shown in [25], there is a simple procedure to obtain both O and the MSE from the 
singular value decomposition of the matrix X)i=i RiRf ■ 

When SNR=l/32, the common-line detection rate is relatively high (44%), the 
spectral constraint on G is not needed. The LUD approach using ADM and IRLS 
outweighs the LS approach in terms of accuracy (Figure 8.2 and Table [8Ta I . When 
SNR=l/64, the common-line detection rate is relatively small (23%), the spectral 
constraint on G improves the accuracy of estimations for all algorithms (T able |8. lb ). 



The LUD approach is better than the LS approach (Figure 8.2 and Table 8.1b[ ). In 
addition, the LUD approach using ADM is less sensitive to the choice of the parameter 
a than the one using IRLS (Table |8.1b| . We also observed that increasing the number 
of images helps little in improving the MSEs. 
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Fig. 8.1: The top row shows four clean images of size 129 x 129 pixels generated from 
a 50S ribosomal subunit volume with different orientations. The middle and bottom 
rows show four noisy images corresponding to those in the top row with SNR=l/32 
and 1/64, respectively. 
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Method 


MSE 


Time 


LS 


SDP 


0.36 


2s 


LUD 


ADM 


0.14 


19s 


IRLS 


0.06 


23s 



(a) SNR = 1/32 



Method 


a 


MSE 


Time 




SDP 


NA 


1.09 


2s 






2/3 


0.74 


12s 


LS 


ADM 


3/4 


0.75 


13s 






4/5 


0.82 


17s 






NA 


2.01 


lis 




ADM 


2/3 


0.69 


42s 






3/4 


0.67 


42s 






4/5 


0.66 


45s 


LUD 




NA 


2.12 


48s 




IRLS 


2/3 


0.56 


432s 






3/4 


0.66 


267s 






4/5 


0.74 


295s 



(b) SNR = 1/64 



Table 8.1: The MSEs 
algorithms. 



1 1 of the estimated rotations and the cost time using different 



9. Discussion. To estimate image orientations, we introduced a robust self con- 
sistency error and used ADM or the IRLS procedure to solve the associated LUD 
problem after SDR. Numerical experiments demonstrate that the solution is less sen- 
sitive to outliers in the detected common-lines than the LS method approach. In 
addition, when the common-line detection rate is low, the spectral norm constraint 
on the Gram matrix G introduced helps tighten the semidefinite relaxation, and thus 
improves the accuracy of the result. We note that it is also possible to consider other 
self consistency errors involving the unsquared deviations raised to some power p 
(e.g., the cases p = 1, 2 correspond to LUD and LS, respectively). We observed that 
the accuracy of the estimated orientations can be improved by using p < 1 provided 
that the initial guess is "good enough". The LUD approach and the spectral norm 
constraint on G can be generalized to the synchronization approach to estimate the 
images' orientations in |27j . It is speculated that under some condition, the exact 
recovery of the orientations can be achieved using the LUD approach as discussed in 
our another paper on synchronization over rotation groups |39j . 
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